Using Syntactic Dependencies to Solve Coreferences

Authors

  • Marcus Stamborg
  • Dennis Medved
  • Peter Exner
  • Pierre Nugues
Abstract

This paper describes the structure of the LTH coreference solver used in the closed track of the CoNLL 2012 shared task (Pradhan et al., 2012). The core of the solver is a mention classifier that uses the algorithm of Soon et al. (2001) and features extracted from the dependency graphs of the sentences. The system builds on the solver of Björkelund and Nugues (2011), which we extended so that it can be applied to the three languages of the task: English, Chinese, and Arabic. We designed a new mention detection module that removes pleonastic pronouns, prunes constituents, and recovers mentions when they do not exactly match a noun phrase. We carefully redesigned the features so that they reflect more complex linguistic phenomena as well as discourse properties. Finally, we introduced a minimal cluster model grounded in the first mention of an entity. We optimized the feature sets for the three languages: we carried out an extensive evaluation of pairs of features and complemented the single features with associations that improved the CoNLL score. We obtained scores of 59.57, 56.62, and 48.25 on English, Chinese, and Arabic, respectively, on the development set; 59.36, 56.85, and 49.43 on the test set; and a combined official score of 55.21.
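The closest-first linking strategy usually associated with Soon et al. (2001) can be sketched as follows. This is a minimal illustration, not the LTH solver itself: the function names, the toy mentions, and the `toy_classifier` (a last-token string match) are all assumptions made for the example, whereas the system described above uses a trained classifier over dependency-based features.

```python
from typing import Callable, List, Sequence


def closest_first_clusters(
    mentions: Sequence[str],
    is_coreferent: Callable[[str, str], bool],
) -> List[List[int]]:
    """Link each mention to its nearest preceding positive antecedent.

    Returns clusters as lists of mention indices, in order of first mention.
    """
    cluster_of: List[int] = []      # cluster id assigned to each mention
    clusters: List[List[int]] = []  # cluster id -> member mention indices
    for j in range(len(mentions)):
        antecedent = None
        # Scan candidate antecedents from right to left and stop at the
        # first pair the classifier accepts.
        for i in range(j - 1, -1, -1):
            if is_coreferent(mentions[i], mentions[j]):
                antecedent = i
                break
        if antecedent is None:
            # No antecedent found: the mention starts a new entity.
            cluster_of.append(len(clusters))
            clusters.append([j])
        else:
            cid = cluster_of[antecedent]
            cluster_of.append(cid)
            clusters[cid].append(j)
    return clusters


def toy_classifier(antecedent: str, anaphor: str) -> bool:
    # Hypothetical stand-in for a trained pairwise classifier:
    # two mentions corefer if their last tokens match, ignoring case.
    return antecedent.lower().split()[-1] == anaphor.lower().split()[-1]


mentions = ["the solver", "a classifier", "The solver", "the classifier"]
clusters = closest_first_clusters(mentions, toy_classifier)
print(clusters)
```

Because each mention is attached to at most one antecedent, the pairwise decisions directly induce the entity clusters; replacing `toy_classifier` with a learned model changes the decisions but not the linking loop.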


Similar resources

The Integration of Dependency Relation Classification and Semantic Role Labeling Using Bilayer Maximum Entropy Markov Models

This paper describes a system to solve the joint learning of syntactic and semantic dependencies. A directed graphical model is put forward to integrate dependency relation classification and semantic role labeling. We present a bilayer directed graph to express probabilistic relationships between syntactic and semantic relations. Maximum Entropy Markov Models are implemented to estimate condi...

Graphical Annotation for Syntax-Semantics Mapping

A potential work item (PWI) for ISO standard (MAP) about linguistic annotation concerning syntax-semantics mapping is discussed. MAP is a framework for graphical linguistic annotation to specify a mapping (set of combinations) between possible syntactic and semantic structures of the annotated linguistic data. Just like a UML diagram, a MAP diagram is formal, in the sense that it accurately spe...

An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies

A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...

Modularizing Codescriptive Grammars for Efficient Parsing

Belongs to proposal section 15.8: Non-syntactic information for semantic evaluation. This work was funded by the German Federal Ministry of Education, Science, Research and Technology (BMBF) under grant number 01 IV 101 K/1 as part of the Verbmobil joint project. Responsibility for the content of this work lies with the authors. Abstract: Unificat...

Coreference Resolution for Swedish and German using Distant Supervision

Coreference resolution is the identification of phrases that refer to the same entity in a text. Current techniques to solve coreferences use machine-learning algorithms, which require large annotated data sets. Such annotated resources are not available for most languages today. In this paper, we describe a method for solving coreferences for Swedish and German using distant supervision that d...

Publication date: 2012